117 research outputs found

    Cybersecurity of Railway Command and Control Systems

    With the large-scale migration to computer-based and network technology, the threat of unauthorized remote access to railway command and control systems is no longer extraordinary. However, external effects must be considered alongside internal factors of signalling software and hardware, such as errors and undocumented features. Risk mitigation for the cybersecurity of signalling installations can only be achieved through a combination of means designed within a holistic approach that integrates both safety and IT security aspects.

    Audio style transfer

    'Style transfer' among images has recently emerged as a very active research topic, fuelled by the power of convolutional neural networks (CNNs), and has fast become a very popular technology in social media. This paper investigates the analogous problem in the audio domain: how can the style of a reference audio signal be transferred to a target audio content? We propose a flexible framework for the task, which uses a sound texture model to extract statistics characterizing the reference audio style, followed by an optimization-based audio texture synthesis to modify the target content. In contrast to mainstream optimization-based visual transfer methods, the proposed process is initialized with the target content instead of random noise, and the optimized loss concerns only texture, not structure. These differences proved key for audio style transfer in our experiments. To extract features of interest, we investigate different architectures, whether pre-trained on other tasks, as done in image style transfer, or engineered based on the human auditory system. Experimental results on different types of audio signal confirm the potential of the proposed approach. Comment: ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2018, Calgary, Canada. IEE

    Optimal parameter estimation for model-based quantization

    We address optimal model estimation for model-based vector quantization for both the constrained resolution (CR) and constrained entropy (CE) cases. To this purpose we derive under high-rate (HR) theory assumptions the rate-distortion (RD) relations for these two quantization scenarios assuming a Gaussian model. Based on the RD relations we show that the maximum likelihood (ML) criterion leads to optimal performance for CE quantization, but not for CR quantization. We introduce a new model estimation criterion for CR quantization that is optimal (under HR theory assumptions) in terms of the RD relation. Our experiments confirm that the proposed criterion for model identification outperforms the ML criterion for a range of conditions. Index Terms: constrained resolution, model-based quantization, model estimation, rate-distortion relation, high-rate theory

    GMM-based classification from noisy features

    We consider Gaussian mixture model (GMM)-based classification from noisy features, where the uncertainty over each feature is represented by a Gaussian distribution. For that purpose, we first propose a new GMM training and decoding criterion called log-likelihood integration which, as opposed to the conventional likelihood integration criterion, does not rely on any assumption regarding the distribution of the data. Second, we introduce two new Expectation Maximization (EM) algorithms, one for each criterion, that allow GMMs to be learned directly from noisy features. We then evaluate and compare the behavior of the two proposed algorithms on a categorization task with artificial data and with speech data corrupted by additive artificial noise, assuming the uncertainty parameters are known. Experiments demonstrate the superiority of the likelihood integration criterion with the newly proposed EM learning in all tested configurations, thus giving rise to a new family of learning approaches that are insensitive to the heterogeneity of the noise characteristics between testing and training data.
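As an illustration of the conventional likelihood integration criterion mentioned in the abstract above, here is a minimal sketch for scalar features. It relies on a standard identity: integrating a Gaussian component N(x; mean, var) against a Gaussian uncertainty N(x; x_hat, var_u) gives N(x_hat; mean, var + var_u). The function names, the toy class models, and the restriction to one dimension are assumptions made for this example; the sketch does not cover the log-likelihood integration criterion or the EM training proposed in the paper.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of a scalar Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def integrated_likelihood(x_hat, var_u, weights, means, variances):
    """Expected GMM likelihood when the clean feature is x ~ N(x_hat, var_u).

    Each component variance is inflated by the uncertainty variance var_u.
    """
    return sum(
        w * gauss_pdf(x_hat, m, v + var_u)
        for w, m, v in zip(weights, means, variances)
    )

def classify(x_hat, var_u, class_models):
    """Pick the class whose GMM gives the highest integrated likelihood."""
    return max(
        class_models,
        key=lambda c: integrated_likelihood(x_hat, var_u, *class_models[c]),
    )

# Hypothetical two-class toy models: (weights, means, variances) per class.
toy_models = {
    "a": ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]),
    "b": ([1.0], [5.0], [1.0]),
}
label = classify(x_hat=0.5, var_u=2.0, class_models=toy_models)
```

With `var_u = 0` the criterion reduces to the ordinary GMM likelihood of the observed feature, which is a convenient sanity check.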

    Text-informed audio source separation. Example-based approach using non-negative matrix partial co-factorization

    So-called informed audio source separation, where the separation process is guided by some auxiliary information, has recently attracted a lot of research interest, since classical blind or non-informed approaches often do not perform satisfactorily in many practical applications. In this paper we present a novel text-informed framework in which a target speech source can be separated from the background in the mixture using the corresponding textual information. First, given the text, we propose to produce a speech example via either a speech synthesizer or a human speaker. We then use this example to guide source separation and, for that purpose, we introduce a new variant of the non-negative matrix partial co-factorization (NMPCF) model based on a so-called excitation-filter-channel speech model. This modeling allows the linguistic information to be shared between the speech example and the speech in the mixture. The corresponding multiplicative update (MU) rules are then derived for parameter estimation, and several extensions of the model are proposed and investigated. We perform extensive experiments to assess the effectiveness of the proposed approach in terms of source separation and alignment performance.
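To give a feel for what multiplicative update (MU) rules look like, the sketch below applies the classical MU rules for plain non-negative matrix factorization with a Euclidean cost. This is a deliberately simpler, swapped-in technique: it is not the NMPCF model or the excitation-filter-channel structure of the paper, and all names (`nmf_mu`, `frob_error`, the toy matrix) are hypothetical.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf_mu(v, rank, n_iter=50, seed=0, eps=1e-9):
    """Factor a non-negative matrix V as W H using multiplicative updates."""
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    h = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(n_cols)] for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(rank)] for i in range(n_rows)]
    return w, h

# Toy non-negative "spectrogram" (rows: frequency bins, columns: frames).
toy_v = [[1.0, 2.0, 0.0, 1.0],
         [2.0, 4.0, 0.0, 2.0],
         [0.0, 1.0, 3.0, 3.0]]
toy_w, toy_h = nmf_mu(toy_v, rank=2, n_iter=200)
```

Because every update multiplies the current factors by a ratio of non-negative terms, the factors stay non-negative without any explicit projection step. In a co-factorization setting, some factors would additionally be shared between the example and the mixture, which this sketch does not attempt.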

    Multi-source TDOA estimation in reverberant audio using angular spectra and clustering

    In this article, we consider the problem of estimating the time differences of arrival (TDOAs) of multiple sources from two-channel reverberant audio mixtures. This is commonly achieved using clustering or angular spectrum-based methods. These methods are limited in that they typically assign the same weight to the spatial information provided by all time-frequency bins and rely on a binary activation model of the sources. Moreover, few experimental comparisons of different methods have been carried out so far. We introduce two new groups of TDOA estimation methods. First, we propose a time-frequency weighting procedure based on a form of signal-to-noise ratio (SNR) that was shown to be efficient for instantaneous mixtures. Second, we introduce new clustering algorithms based on the assumption that all sources can be active in each time-frequency bin. We also study a two-step procedure combining angular spectra and clustering, and conduct a large-scale experimental evaluation of the proposed and existing methods. The best average localization performance is achieved by a variant of the generalized cross-correlation with phase transform (GCC-PHAT) method without subsequent clustering. Moreover, one of the SNR-based methods we propose outperforms this method for small microphone spacing.
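The GCC-PHAT method named in the abstract above can be sketched for the simplest case of a single source and two channels. The code whitens the cross-spectrum so that only phase information remains, then picks the lag with the largest correlation. The function names and the naive O(n^2) DFT (used only to keep the example dependency-free) are assumptions for illustration; the multi-source variants, SNR-based weighting, and clustering studied in the article are not reproduced here.

```python
import cmath
import random

def dft(x, inverse=False):
    """Naive discrete Fourier transform, adequate for short signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def gcc_phat_delay(x1, x2, max_lag):
    """Estimate by how many samples x2 lags x1 (negative if x2 leads)."""
    n = len(x1) + len(x2)  # zero-pad to avoid circular wrap-around
    spec1 = dft(list(x1) + [0.0] * (n - len(x1)))
    spec2 = dft(list(x2) + [0.0] * (n - len(x2)))
    whitened = []
    for a, b in zip(spec1, spec2):
        cross = b * a.conjugate()
        mag = abs(cross)
        # Phase transform: keep only the phase of the cross-spectrum.
        whitened.append(cross / mag if mag > 1e-12 else 0j)
    corr = [c.real for c in dft(whitened, inverse=True)]
    # Negative lags live at the end of the circular correlation.
    return max(range(-max_lag, max_lag + 1), key=lambda lag: corr[lag % n])

# Toy check: white noise and a copy delayed by 5 samples.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(64)]
delayed = [0.0] * 5 + reference
estimated_lag = gcc_phat_delay(reference, delayed, max_lag=10)
```

Dividing the lag in samples by the sampling rate gives the TDOA in seconds. In practice the transform would be computed per frame with an FFT and the resulting correlations pooled over time, which is where the weighting strategies discussed in the abstract come in.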